A technique for controlling voice quality of synthetic speech using multiple regression HSMM
Abstract
This paper describes a technique for controlling the voice quality of synthetic speech using a multiple regression hidden semi-Markov model (HSMM). In this technique, we assume that the mean vectors of the output and state duration distributions of the HSMM are modeled by multiple regression on a parameter vector called the voice quality control vector. We first choose three features for controlling voice quality, namely “smooth voice – nonsmooth voice,” “warm – cold,” and “high-pitched – low-pitched,” and then attempt to control the voice quality of synthetic speech along these features. The results of several subjective tests show that the proposed technique can change these voice quality features intuitively.
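The regression step described in the abstract can be illustrated with a minimal sketch. It assumes the usual multiple-regression form, in which a mean vector is a regression matrix applied to the control vector extended with a leading bias term, mu = H [1, v_1, ..., v_L]^T. The abstract does not state this exact form, and the function name `regress_mean`, the matrices `H_output` and `H_duration`, and all numeric values below are hypothetical, chosen only for illustration.

```python
def regress_mean(H, v):
    """Return mu = H * [1, v_1, ..., v_L]^T for a regression matrix H.

    H is a list of rows, one row per dimension of the mean vector.
    v is the voice quality control vector (assumed form, see lead-in).
    """
    xi = [1.0] + list(v)  # extended control vector with a bias term
    if any(len(row) != len(xi) for row in H):
        raise ValueError("each row of H must have len(v) + 1 entries")
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Hypothetical regression matrices for one state (illustrative values only).
# Columns: bias, smooth/nonsmooth, warm/cold, high-pitched/low-pitched.
H_output = [
    [5.0, 0.5, -0.25, 1.0],
    [1.0, 0.0, 0.25, -0.5],
]
H_duration = [[8.0, 1.0, 0.0, -2.0]]

neutral = [0.0, 0.0, 0.0]       # all-zero vector: mean equals the bias column
high_pitched = [0.0, 0.0, 1.0]  # move only along the third axis

mu_neutral = regress_mean(H_output, neutral)
mu_high = regress_mean(H_output, high_pitched)
dur_high = regress_mean(H_duration, high_pitched)

print(mu_neutral, mu_high, dur_high)
```

Under this assumed form, changing one component of the control vector shifts both the output mean and the state duration mean linearly, which is what allows a single vector to steer spectrum, pitch, and timing together.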
Related papers
A style control technique for singing voice synthesis based on multiple-regression HSMM
This paper proposes a technique for controlling singing style in HMM-based singing voice synthesis. A style control technique based on the multiple regression HSMM (MRHSMM), which was originally proposed for HMM-based expressive speech synthesis, is applied to the conventional technique. The idea of pitch adaptive training is introduced into the MRHSMM to improve the modeling accuracy of fu...
A style control technique for speech synthesis using multiple regression HSMM
This paper presents a technique for intuitively controlling the degree or intensity of speaking styles and emotional expressions of synthetic speech. The conventional style control technique based on the multiple regression HMM (MRHMM) makes it difficult to control the phone duration of synthetic speech because the HMM has no explicit parameter that models phone duration appropriately. To ...
A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM
This paper describes a technique for modeling and controlling emotional expressivity of speech in HMM-based speech synthesis. A problem of conventional emotional speech synthesis based on HMM is that the intensity of an emotional expression appearing in synthetic speech completely depends on the database used for model training. To take into account the emotional expressivity that listeners act...
Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis
This paper describes a style adaptation technique using hidden semi-Markov model (HSMM) based maximum likelihood linear regression (MLLR). The HSMM-based MLLR technique can estimate regression matrices for affine transform of mean vectors of output and state duration distributions which maximize likelihood of adaptation data using EM algorithm. In this study, we apply this adaptation technique ...
Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis
This paper describes the use of combined linear regression and ex-post MAP methods for an average-voice-based speech synthesis system based on HMMs. To generate more natural-sounding speech with the average-voice-based speech synthesis system when a large amount of training data is available, we apply ex-post MAP estimation after the linear-transformation-based adaptation. We investigate how the am...